
Track parquet writer encoding memory usage on MemoryPool #11345

Merged · 10 commits into apache:main · Jul 10, 2024

Conversation

@wiedld (Contributor) commented Jul 9, 2024

Which issue does this PR close?

Closes #11344

Rationale for this change

ParquetSink can use non-trivial amounts of memory to buffer row groups prior to flush, when executed within a task context. This memory usage should therefore be accounted for in the task's memory pool.

What changes are included in this PR?

Ensure memory accounting under three use cases for ParquetSink:

  • non-parallel writes
  • multiple parallel row-group writers
  • multiple parallel column writers

How parallelized-write memory tracking works is summarized here. I feel like it should be a doc comment, but I wasn't sure where. 🤔
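As a rough illustration of the accounting pattern described above (this is a toy model, not the actual DataFusion `MemoryPool` API; the `Pool` and `Reservation` types here are invented for the sketch): each column writer holds a reservation while encoding, the row-group reservation grows by the encoded size before the column reservation drops, and the row-group reservation shrinks once bytes are flushed to the file.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Toy pool that only tracks the total number of reserved bytes.
#[derive(Default)]
struct Pool {
    reserved: AtomicUsize,
}

/// A reservation that returns its bytes to the pool when dropped,
/// loosely mirroring a per-column or per-row-group reservation.
struct Reservation {
    pool: Arc<Pool>,
    size: usize,
}

impl Reservation {
    fn new(pool: &Arc<Pool>) -> Self {
        Self { pool: Arc::clone(pool), size: 0 }
    }
    /// Account for newly buffered bytes (e.g. encoded column data).
    fn grow(&mut self, bytes: usize) {
        self.pool.reserved.fetch_add(bytes, Ordering::SeqCst);
        self.size += bytes;
    }
    /// Release bytes once the corresponding data is flushed to the file.
    fn shrink(&mut self, bytes: usize) {
        let bytes = bytes.min(self.size);
        self.pool.reserved.fetch_sub(bytes, Ordering::SeqCst);
        self.size -= bytes;
    }
}

impl Drop for Reservation {
    fn drop(&mut self) {
        // Any leftover bytes are returned to the pool automatically.
        let leftover = self.size;
        self.shrink(leftover);
    }
}

fn pool_usage_after_flush() -> usize {
    let pool = Arc::new(Pool::default());
    let mut rg = Reservation::new(&pool); // row-group level reservation
    {
        let mut col = Reservation::new(&pool); // column-writer reservation
        col.grow(1024); // bytes buffered while encoding one column
        rg.grow(1024);  // hand accounting to the row group before `col` drops
    } // `col` dropped here: its 1024 bytes return to the pool
    rg.shrink(1024); // row group flushed to the file
    pool.reserved.load(Ordering::SeqCst)
}

fn main() {
    // After the column reservation drops and the row group flushes,
    // the pool should be back to zero reserved bytes.
    assert_eq!(pool_usage_after_flush(), 0);
}
```

Note the brief double-count while both the column and row-group reservations hold the same bytes; that conservative overlap avoids a window where buffered data is unaccounted.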

Are these changes tested?

Yes

Are there any user-facing changes?

No

@github-actions bot added the core (Core DataFusion crate) label Jul 9, 2024
@wiedld wiedld force-pushed the 11344/track-parquet-sink-memory branch from 518a7e9 to 073e471 Compare July 9, 2024 02:54
refactor(11344): tweak the ordering to add col bytes to rg_reservation, before selecting shrinking for data bytes flushed
@wiedld wiedld marked this pull request as ready for review July 9, 2024 03:59
@alamb (Contributor) left a comment:

Thank you @wiedld

For anyone else reading this PR: we discussed this face to face, and I think @wiedld has some ideas for how to make it simpler.

@alamb alamb marked this pull request as draft July 9, 2024 19:27
@alamb (Contributor) commented Jul 9, 2024

Thanks @wiedld -- please mark this PR as ready for review when it is ready for another look

Comment on lines +343 to +344
// TODO: update error handling in ParquetSink
"Unable to send array to writer!",
@wiedld (Author) replied:

The parallelized writes have vectors of channels and vectors of spawned tasks. The error we are hitting here (when the memory limit is reached) is for the closed channel, not the underlying cause.

I believe we want to surface the errors from the tasks, which should exit due to the memory reservation error. I need to poke around a bit more.

A contributor replied:

I think we need to update several map_err statements to propagate inner error messages rather than ignore them. E.g. change

    col_array_channels[next_channel]
        .send(c)
        .await
        .map_err(|_| {
            DataFusionError::Internal("Unable to send array to writer!".into())
        })

to something like

    col_array_channels[next_channel]
        .send(c)
        .await
        .map_err(|e| internal_datafusion_err!("Unable to send array to writer due to error {e}"))

@alamb replied:

Filed #11397 to track

@alamb (Contributor) commented Jul 10, 2024

FYI @devinjdangelo

let (writer, col_reservation) = task.join_unwind().await?;
let encoded_size = writer.get_estimated_total_bytes();
rg_reservation.grow(encoded_size);
drop(col_reservation);
A contributor replied:

I don't think the explicit drop is needed -- it will be dropped automatically by the compiler when col_reservation goes out of scope on the next line
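The point about the compiler-inserted drop can be shown with a tiny stand-in type (illustrative only; `Reservation` and `LIVE_RESERVATIONS` here are invented, not the PR's code): a value's `Drop` runs automatically at the end of its scope, so an explicit `drop()` just before that point is redundant.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many stand-in reservations are currently alive.
static LIVE_RESERVATIONS: AtomicUsize = AtomicUsize::new(0);

struct Reservation;

impl Reservation {
    fn acquire() -> Self {
        LIVE_RESERVATIONS.fetch_add(1, Ordering::SeqCst);
        Reservation
    }
}

impl Drop for Reservation {
    fn drop(&mut self) {
        LIVE_RESERVATIONS.fetch_sub(1, Ordering::SeqCst);
    }
}

fn live_after_scope() -> usize {
    {
        let _col_reservation = Reservation::acquire();
        // No explicit `drop(_col_reservation)` needed: the compiler
        // drops it automatically when this scope ends.
    }
    LIVE_RESERVATIONS.load(Ordering::SeqCst)
}

fn main() {
    // The reservation was released by the implicit end-of-scope drop.
    assert_eq!(live_after_scope(), 0);
}
```

An explicit `drop()` still has a place when the release must happen earlier than the end of scope, e.g. before a long-running call later in the same function.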

@alamb (Contributor) left a comment:

Thanks @wiedld -- this is looking quite close

@wiedld wiedld force-pushed the 11344/track-parquet-sink-memory branch from 46dc2d3 to 6ce964e Compare July 10, 2024 15:50
@devinjdangelo (Contributor) left a comment:

Thanks @wiedld and @alamb, this looks good to me! I left a comment on how I think we can improve the error messages.

@wiedld (Author) commented Jul 10, 2024

> Thanks @wiedld and @alamb this looks good to me! I left a comment on how I think we can improve the error messages.

Thank you @devinjdangelo! Note that the error message propagated back is channel closed and not the resource exhaustion. I've been fiddling with a solution, and I'll put up a PR (pinging you both) hopefully later today, right after this merges.
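The masking described above can be reproduced with a std-only sketch (illustrative, not the ParquetSink code; the error strings are made up): when a worker fails and drops its receiver, the sender only ever observes a closed channel, so the root cause has to be recovered by joining the task.

```rust
use std::sync::mpsc;
use std::thread;

/// Returns (what the sender sees, what joining the worker reveals).
fn observed_errors() -> (String, String) {
    let (tx, rx) = mpsc::channel::<u64>();
    let worker = thread::spawn(move || -> Result<(), String> {
        // Simulate the task exiting early, e.g. on a reservation failure.
        drop(rx);
        Err("Resources exhausted: memory limit reached".to_string())
    });
    // Join first so the receiver is definitely gone before we send.
    let task_err = worker.join().unwrap().unwrap_err();
    // The send error alone says nothing about *why* the worker stopped.
    let send_err = match tx.send(1) {
        Err(e) => format!("Unable to send array to writer: {e}"),
        Ok(()) => unreachable!("receiver was dropped"),
    };
    (send_err, task_err)
}

fn main() {
    let (send_err, task_err) = observed_errors();
    // The channel error masks the root cause; the joined result carries it.
    assert!(send_err.starts_with("Unable to send array to writer"));
    assert!(task_err.contains("Resources exhausted"));
}
```

This is why surfacing the joined task result (rather than only the `send` error) gives users the actual resource-exhaustion message.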

@wiedld wiedld marked this pull request as ready for review July 10, 2024 16:58
@alamb (Contributor) left a comment:

Thank you @wiedld and @devinjdangelo -- this looks good to me now.

I agree with @devinjdangelo that it would be great to improve the error message propagation, but I also think we could do that as a follow-on PR.

Edit: looks like @wiedld plans to do this as a follow-on PR, so I'll file a ticket and merge this one.

Follow on ticket: #11397

@alamb alamb changed the title feat(11344): track parquet encoding memory usage Track parquet writer encoding memory usage on MemoryPool Jul 10, 2024
@alamb alamb merged commit 6038f4c into apache:main Jul 10, 2024
24 checks passed
@alamb (Contributor) commented Jul 10, 2024

🚀

@alamb alamb deleted the 11344/track-parquet-sink-memory branch July 10, 2024 18:21
Lordworms pushed a commit to Lordworms/arrow-datafusion that referenced this pull request Jul 12, 2024
* feat(11344): track memory used for non-parallel writes

* feat(11344): track memory usage during parallel writes

* test(11344): create bounded stream for testing

* test(11344): test ParquetSink memory reservation

* feat(11344): track bytes in file writer

* refactor(11344): tweak the ordering to add col bytes to rg_reservation, before selecting shrinking for data bytes flushed

* refactor: move each col_reservation and rg_reservation to match the parallelized call stack for col vs rg

* test(11344): add memory_limit enforcement test for parquet sink

* chore: cleanup to remove unnecessary reservation management steps

* fix: fix CI test failure due to file extension rename
wiedld added a commit to influxdata/arrow-datafusion that referenced this pull request Jul 12, 2024
appletreeisyellow pushed a commit to influxdata/arrow-datafusion that referenced this pull request Jul 12, 2024
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 17, 2024
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 18, 2024
appletreeisyellow pushed a commit to influxdata/arrow-datafusion that referenced this pull request Jul 22, 2024
Labels: core (Core DataFusion crate)
Projects: None yet
Successfully merging this pull request may close these issues: Track memory used by parquet writers.
3 participants